author:
- A
- B
submission:
- PMIR
- B
year: "2024"
file:
related:
tags: []
review date: 2025-01-03
Summary
"This paper investigates fully test-time adaptation for object detection. It means to update a trained object detector on a single testing image before making a prediction, without access to the training data."
Proposes a fully test-time domain adaptation method that adapts an object detector on a single test image, without any source or target dataset [Page 1, Abstract]
Proposes a new test-time adaptation framework to address object detectors' vulnerability to domain shift [Page 1, Section 1]
Finds that domain shift degrades pseudo-label quality in the self-training-based baseline [Page 3, Section 3.2.2]
Improves pseudo-label quality with two IoU-based indicators, improving domain adaptation performance [Page 4, Section 3.3]
In real applications the target domain differs from image to image and is unknown, which limits existing domain adaptation methods [Page 2, Section 1]
Proposes a self-training baseline that iteratively updates the model on a single test image [Page 3, Section 3.2.1]
Identifies the problem of low-quality pseudo labels caused by domain shift through a diagnostic study [Pages 3-4, Section 3.2.2]
Improves pseudo-label quality with the IoU Filter, composed of the IoU-CI and IoU-OD indicators [Pages 4-5, Section 3.3]
Demonstrates superior performance over existing methods in experiments on five datasets [Pages 5-7, Section 4]
"The real world is complex and non-stationary, which is unlikely to be covered by any fixed dataset. The detector must adapt itself on the fly to the unknown and varying domain shift at test time."
Prior work on domain adaptation
"Though deep learning approaches have drastically pushed forward the state-of-the-art object detection performance on standard benchmarks, current object detectors are often vulnerable to domain shifts between the training data and testing images, e.g., unseen styles, weather, lighting conditions, and noise."
"Both UDA and SFDA assume that the target domain is known and fixed and that a target dataset sampled from this domain is available for training. However, the real world is complex and non-stationary, which is unlikely to be covered by any fixed dataset."
"We have two observations.
First, the baseline consistently improves the performance of the original detector. This demonstrates the potential of the self-training framework in our task.
Second, in most scenarios, using detection confidence to select pseudo labels leads to similar performance as using all detections as pseudo labels. Meanwhile, Fig. 2 shows that the pseudo labels are noisy even at a high confidence threshold."
2-1. Designing an effective domain adaptation method from a single image [Page 2, Section 1]
"It will facilitate many applications, e.g., image understanding systems for social media and visually impaired people, where the target domain differs from image to image, hence adaptation can be learned only from one sample."
2-2. Improving pseudo-label quality [Page 4, Section 3.3]
"Through a diagnostic study of a baseline self-training framework, we show that a great challenge of this task is the unreliability of pseudo labels caused by domain shift. We propose a simple yet effective method, i.e., IoU Filter, to address this challenge."
2-3. Ensuring stability during iterative updates [Page 7, Section 4.3.3]
"It delineates that all these methods improve at the first 5 or 6 iterations, but degrade in more iterations and would continue this trend in the future. This could be attributed to two reasons.
First, as there is only one testing image to perform adaptation, too many iterations could lead to overfitting.
Second, detection errors could accumulate in the pseudo labels and adversely affect the test-time training."
"It is an iterative algorithm. At the t-th iteration, the current detector makes a prediction on I, where we then collect confident detections as pseudo labels.
The current detector θ_{t-1} makes a prediction D_t = {(b_{t,i}, p_{t,i}) : ∀i} on I, where b_{t,i} is the bounding box of the i-th object instance and p_{t,i} ∈ [0, 1]^K is the probability distribution over the K classes."
1. Iterative prediction and training
- At iteration t, the current detector θ_{t-1} makes a prediction
- The prediction is written D_t = {(b_{t,i}, p_{t,i})}
- b_{t,i}: bounding box of the i-th object instance
- p_{t,i}: probability distribution over the K classes
2. Pseudo-label generation
- Filter detections by the confidence threshold λ_conf
- P_t = {(b_{t,i}, y_{t,i}) : c_{t,i} > λ_conf}
- c_{t,i}: detection confidence
- y_{t,i}: predicted object class
3. Model update
- Update the current model θ_{t-1} by gradient descent
- Train on the pseudo labels to obtain θ_t
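The three steps above can be sketched as a short loop. The `detector` interface (`predict`, `train_step`) and all names here are our assumptions for illustration, not the paper's API:

```python
import numpy as np

def self_train_on_image(detector, image, num_iters=5, conf_thresh=0.6):
    """Single-image self-training sketch: at each iteration t, predict,
    keep confident detections as pseudo labels, and take one update step."""
    for t in range(num_iters):
        boxes, probs = detector.predict(image)   # D_t = {(b_{t,i}, p_{t,i})}
        conf = probs.max(axis=1)                 # c_{t,i}: detection confidence
        labels = probs.argmax(axis=1)            # y_{t,i}: predicted class
        keep = conf > conf_thresh                # P_t: confidence filter
        pseudo = (boxes[keep], labels[keep])
        detector.train_step(image, pseudo)       # θ_{t-1} -> θ_t by gradient descent
    return detector.predict(image)               # final prediction
```

The IoU Filter described next would slot in as an extra condition on `keep`.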
"The IoU Filter consists of two new IoU-based indicators that are complementary to the detection confidence... Our statistical results indicate that both indicators increase the percentage of correct pseudo labels."
Methods
A. IoU between Consecutive Iterations (IoU-CI):
"For every object instance in D_t, we match it to an instance in D_{t-1} with the same class and maximum IoU... The IoU-CI score of an instance in D_t is defined as the IoU between itself and its matched instance in D_{t-1}."
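A minimal sketch of the IoU-CI score, assuming the match is the same-class instance from the previous iteration with the highest IoU (function names and box format are ours):

```python
def box_iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def iou_ci_scores(curr_boxes, curr_cls, prev_boxes, prev_cls):
    """For each detection in D_t, return the IoU with its best same-class
    match in D_{t-1} (0.0 if no same-class instance exists)."""
    scores = []
    for box, cls in zip(curr_boxes, curr_cls):
        same_class = [box_iou(box, pb)
                      for pb, pc in zip(prev_boxes, prev_cls) if pc == cls]
        scores.append(max(same_class) if same_class else 0.0)
    return scores
```

Detections whose IoU-CI score falls below the IoU-CI threshold would then be dropped from the pseudo labels.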
B. IoU between Overlapped Detections (IoU-OD):
"If c_i is the highest among them, instance i passes the IoU-OD filter; otherwise, it is excluded from the pseudo labels. The IoU-OD filter is like class-agnostic non-maximum suppression (NMS)."
Training settings / model architecture
Evaluated on five datasets (Clipart1k, Comic2k, Watercolor2k, Foggy Cityscapes, Rainy Cityscapes)
"For all five testing datasets, the detection confidence threshold is set as 0.6, the IoU-CI threshold is set as 0.6, and the IoU-OD threshold is set as 0.9."
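For reference, the quoted thresholds as a plain config fragment (key names are ours, values are the paper's):

```python
# Shared across all five testing datasets, per the quote above.
THRESHOLDS = {
    "detection_confidence": 0.6,  # λ_conf
    "iou_ci": 0.6,                # IoU between consecutive iterations
    "iou_od": 0.9,                # IoU between overlapped detections
}
```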
Demonstrates superior performance in comparison with recent methods such as CoTTA
Analyzes the effect of each component through an ablation study